Apparatus for determining DNA sequences by mass spectrometry

ABSTRACT

This invention relates to apparatus for sequencing natural or recombinant DNA and other polynucleotides. In particular, this invention relates to a method for sequencing polynucleotides based on mass spectrometry to determine which of the four bases (adenine, guanine, cytosine or thymine) is a component of the terminal nucleotide. More specifically, the present invention relates to identifying the individual nucleotides by the mass of stable nuclide markers contained within either the dideoxynucleotides, the DNA primer, or the deoxynucleotides added to the primer. This invention is particularly useful in identifying specific DNA sequences present in very small quantities in biological products produced by fermentation or other genetic engineering techniques. The invention is therefore useful in evaluating safety and other health concerns related to the presence of DNA in products resulting from genetic engineering techniques.

This application is a division of Ser. No. 07/209,247, filed Jun. 20, 1988, now U.S. Pat. No. 4,903,059.

BACKGROUND OF THE INVENTION

This invention relates to the field of the determination of DNA sequences and the use of automated techniques for such determination.

The ability to sequence DNA has become a core technology in molecular biology, and has contributed greatly to the understanding of DNA structural organization and gene function. The facility with which DNA sequencing may be accomplished will substantially affect the rate of development of related technologies, including the production of new therapeutic agents, useful plant varieties and microorganisms via recombinant DNA technology and the understanding of human genetic disorders and pathology through gene mapping and chromosomal sequence analysis.

Initially, researchers focused on reading the genetic code and on translating the nucleotide sequence into the amino acid sequence of a protein. This occurs by a process of DNA transcription into mRNA, and then actual synthesis of the protein on ribosomes. In eucaryotic cells, large specific segments of the initial transcript of mRNA, termed introns, are transcribed but are excised during an intermediary processing step. Much of the chromosomal DNA is not translated, and its specific function is largely unknown. This "intervening" or intron DNA was first thought to be excess genetic material. However, as biologists begin to unravel the details of cell differentiation and the processes controlling gene transcription, it is now believed that the specific sequences of certain portions of some of these large regions of transcribed but untranslated DNA may also provide important regulatory signals.

The potential applications which derive from DNA sequencing have only begun to be explored. On a large scale, analysis of human chromosomal DNA is considered vital to understanding human pathological conditions, including genetic disease, AIDS and cancer, because often only subtle differences, even single nucleotide substitutions, can lead to serious disorders. Serious consideration is now being given to the sequencing of the entire human genome, approximately 3 billion base pairs. The success of this project will depend on rapid, sensitive, inexpensive automated methods to sequence DNA.

The fundamental approach to determination of DNA sequence has been well established. Restriction endonucleases are employed to cleave chromosomal DNA into specific smaller segments, and recombinant cloning techniques are then used to purify and generate analyzable quantities of DNA. The specific sequence of each segment can then be determined by either the Maxam-Gilbert chemical cleavage method or, preferably, the Sanger dideoxy terminated enzymatic method. In either case, a set of all possible fragments ending in a specific base is generated. The individual fragments can be resolved electrophoretically by molecular weight, and the sequence of the original DNA segment is then derived by knowing the identity of the terminal base in each fragment.

In its broadest aspect, this invention is directed to methods and reagents for sequencing DNA and other polynucleotides. In particular, this invention describes reagents and methods for automating and increasing the sensitivity of both the Sanger, Proc. Natl. Acad. Sci. USA, 74, 5463 (1977) and Gish and Eckstein, Science, 240, 1520-1522 (1988), procedures for sequencing polynucleotides. The methods of the present invention are based on mass spectrometric determination of each of the four component terminal nucleotide residues, where the information regarding the identity of the individual nucleotides is contained in the mass of stable nuclide markers.

2. Summary Of The Prior Art

In the Sanger dideoxy method (Proc. Natl. Acad. Sci., USA, 74, 5463 (1977)), the DNA to be sequenced is exposed to a DNA polymerase, a cDNA primer, and a mixture of the four component deoxynucleotides, plus one of the four possible 2',3'-dideoxynucleotides. The DNA to be sequenced is typically a single stranded DNA clone prepared in the phage vector M13, although Chen and Seeburg have disclosed a method for applying the Sanger method to supercoiled plasmid DNA (DNA 4:165-170 (1985)). In addition, Innis et al., Proc. Natl. Acad. Sci., USA 85, 9436-9440 (1988) have disclosed a method for direct sequencing of chromosomal DNA amplified by the polymerase chain reaction. For any DNA template, however, the principle behind the dideoxy chain termination method remains the same. There is a competition for incorporation of the normal deoxy- and the dideoxy-nucleotide by the polymerase into the growing complementary chain. When a dideoxynucleotide is incorporated, further chain extension is prevented. Since there is a finite probability that this chain terminating event may occur at each complementary site of the appropriate base, a mixture of all possible fragments ending in that dideoxy base will be generated. This mixture of fragments can be separated by size via gel electrophoresis. When the experiment is repeated with each dideoxy base, four mixtures of fragments, each terminating in a specific residue, are produced. When this set of mixtures is chromatographed in four adjacent lanes, so that fragment lengths in the four mixtures can be correlated with each other, the sequence of the original DNA is determined by relating the fragment length to the identity of the terminating dideoxy base.
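The read-out logic of the four-lane dideoxy experiment described above can be illustrated with a minimal sketch. It is not part of the disclosed apparatus. It assumes an idealized reaction in which every possible terminated fragment is produced, and the function names (`termination_lengths`, `read_gel`) and the example strand are hypothetical.

```python
def termination_lengths(new_strand, dd_base):
    """Lengths of every chain that can terminate in dd_base.

    new_strand is the complementary chain grown from the primer; a chain
    can stop at each position where the incorporated base equals dd_base.
    """
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]


def read_gel(lanes):
    """Recover the sequence by ordering the fragments of all four lanes by length."""
    ladder = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in ladder)


# Hypothetical example strand, one "lane" per dideoxy base.
strand = "GATTCAGC"
lanes = {base: termination_lengths(strand, base) for base in "ACGT"}
recovered = read_gel(lanes)
```

Each lane holds only fragment lengths; the identity of the terminal base comes from knowing which lane a fragment ran in, which is why the lanes must be compared with one another.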

Maxam and Gilbert, Methods in Enzymology, 65, 499-500 (1980), disclosed a method for DNA sequencing using chemical cleavage. In this method, each end of a DNA fragment to be sequenced is labeled. This DNA fragment is then cleaved preferentially at one of the nucleotides, under conditions favoring one cleavage per strand. This procedure is then repeated for each of the other three nucleotides. The four samples are then run side by side on an electrophoretic gel. Autoradiography identifies the position of a particular nucleotide by the length of the fragments produced by cleavage at that particular nucleotide. This method suffers from the same drawbacks as the Sanger method.

The position of the fragment in gel electrophoresis is usually revealed by staining or by autoradiography. In autoradiography methods, the fragments have typically been labeled with ³²P or ³⁵S radionuclides, where either the DNA primer or one of the component deoxynucleotides has been tagged, and that label incorporated in a specific or random fashion. After fractionation of the fragments on acrylamide gels, the gels are used to expose films. This presents a number of difficulties. For example, the short half-life of ³²P requires that the sequencing experiment be anticipated days in advance so that fresh label can be used. Additionally, the high energy beta radiation emitted by the ³²P leads to scission of the phosphodiester linkages within the DNA fragments synthesized in the sequencing reaction and thus requires immediate fractionation of sequencing reaction products. The use of ³⁵S (Ornstein, et al., Biotechniques, 3, 476 (1985)), which has a longer half-life and less energetic emission, somewhat ameliorates these problems, but requires much longer times of exposure to film for the development of a usable autoradiograph, often in the range of one to three days. Whichever radionuclide is used, the fact that a single type of label is used for each sequencing reaction requires that each set of reaction products be fractionated in a separate lane on the sequencing gel. Common problems in running sequencing gels include uneven heating and the presence of impurities, either of which can cause adjacent lanes on the sequencing gel to run in an uneven fashion, making the comparison of fragment migration in adjacent lanes, and thus DNA sequence determination, difficult or impossible. The use of unstable radionuclides also poses a health risk to the investigator.

An alternate method of detection was developed by the California Institute of Technology group (Smith, et al., Nature, 321, 674 (1986)) in which the terminal base residues are labeled with a fluorescent marker attached to the DNA primer. If four fluorescent markers of different spectral emission maxima are used, then the four separate sets of polymerase fragments can be combined and co-chromatographed. This method is also disclosed in EPO Patent No. 87300998.9.

A second variation of the fluorescent tagging approach has recently been reported by the DuPont group (Science, 238, 336 (1987)) wherein a unique fluorescent moiety is attached directly to the dideoxynucleotide. This may represent an improvement over the CalTech primer tagging approach in that a single polymerase experiment can now be run with a mixture of the four dideoxy termination bases. However, one trade-off for this simplification is potential replication errors by the polymerase, arising from mis-incorporation of the modified dideoxynucleotide base analogs.

These modified Sanger methods are an improvement over the original Sanger method in the extent to which DNA can be sequenced because the chromatographic ambiguities have been reduced. However, a number of limitations are associated with the use of fluorescent labels in these modified Sanger reactions. In particular, there are chromatographic differences among fragments arising from the unique mobilities of the different organic fluorescent markers. Moreover, there are difficulties in distinguishing individual fluorescent markers because of overlap in their spectral bandwidths. Finally, there is a low sensitivity of detection inherent in the extinction coefficients of the fluorescent markers.

All of the above variants of the Sanger method for sequencing have used slab gel electrophoresis to effect size separation of the DNA fragments. The casting and loading of slab gels is a skilled but intrinsically manual operation. The only aspect of this process which has been automated with any success is the reading of the gel by certain commercial devices with some type of laser scanner/spectrophotometer.

A labeling method is needed which eliminates chromatographic ambiguity by imparting to each sequencing reaction product its own specific tag, but in which this specific tag is "invisible" to the chromatographic apparatus, i.e., does not affect the chromatographic mobility of the different sequencing products differentially. Additionally, a label detection system is needed which is much more sensitive than the fluorescence system, and which can distinguish labels based upon characteristics which separate them discretely, rather than by trying to distinguish between broad overlapping traits. Ideally, a stable, non-radioactive label would be used, eliminating the short useful lifetime of the label and of products containing the label, as well as potential health risks to investigators.

Eckstein and Goody, Biochemistry, 15, 1685 (1976), disclose a method of chemical synthesis for adenosine-5'-(O-1-thiotriphosphate) and adenosine-5'-(O-2-thiotriphosphate).

Eckstein, Accounts Chem. Res., 12, 204 (1978), discloses a group of phosphorothioate analogs of nucleotides.

Gish and Eckstein, Science, 240, 1520-1522 (1988), disclose an alternative method for sequencing DNA and RNA employing base specific chemical cleavage of phosphorothioate analogs of the nucleotides which were incorporated in a cDNA sequence.

Japanese Patent No. 59-131,909 (1986), discloses a nucleic acid detection apparatus which detects nucleic acid fragments which are separated by electrophoretic techniques, liquid chromatography, or high speed gel filtration. Detection is achieved by utilizing nucleic acids into which S, Br, I, or Ag, Au, Pt, Os, Hg or similar metallic elements have been introduced. These elements are generally absent in natural nucleic acids. Introduction of one of these elements into a nucleotide of a nucleic acid allows that nucleic acid or fragment thereof to be detected by means of atomic absorption, plasma emission or mass spectroscopy. However, this reference does not suggest or disclose any application of the described methods or apparatus to the sequencing of DNA, such as by the Sanger method. Specifically, it does not teach that a plurality of specific isotopes may be used to identify the specific terminal nucleotide residues. Nor does it teach that by total combustion of DNA to oxides of carbon, hydrogen, nitrogen and phosphorus, the detection sensitivity by mass spectrometry for trace elements, such as sulfur which is not normally found in DNA, is vastly improved. The combustion step, which is one aspect of the present application, is essential to eliminate the myriad of fragment ions from DNA. These fragment ions would normally mask the presence of trace ions of SO₂ in conventional mass spectrometry. What this reference does disclose is that DNA may be tagged (by undisclosed means) with trace elements, including sulfur, as an aid to detection of DNA, and that these trace elements may be detected by a variety of means, including mass spectrometry.

Details of DNA sequencing are found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y., N.Y., F. M. Ausubel, et al., eds., (1987), Chapter 7 of which is hereby incorporated by reference. Smith, et al., Anal. Chem. 60, 438-441 (1988), describes capillary zone electrophoresis-mass spectrometry using an electrospray ionization interface and is hereby incorporated by reference.

SUMMARY OF THE INVENTION

This invention relates to improved methods for sequencing DNA, DNA fragments, or other polynucleotides. The invention includes apparatus, reagents and mixtures of reagents for carrying out the method. In particular, this invention relates to the use of mass spectrometry to identify the terminal nucleotide of a polynucleotide, based upon the presence of a specific stable nuclide marker in the terminal nucleotide or the polynucleotide fragment containing that particular terminal nucleotide. The invention offers numerous advantages over previous methods of sequencing polynucleotides, including greater sensitivity, increased signal specificity, simplified manipulation and safer handling.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows a schematic diagram of a complementary DNA sequence attached to a primer DNA sequence and a typical series of chain terminated polynucleotide fragments prepared according to Scheme E herein.

FIG. 2A shows in combination a column for separating DNA sequences according to size and a means for sequentially transporting DNA sequences to a mass spectrometer.

FIG. 2B shows superimposed "ion current vs. time" printouts for ³²SO₂, ³³SO₂, ³⁴SO₂, and ³⁶SO₂, resulting from combustion of a chain terminated DNA sequence.

BRIEF DESCRIPTION OF THE INVENTION

This invention relates to methods, reagents, apparatus and intermediates involved in the determination of natural or artificially made ("recombinant") DNA sequences and fragments thereof. This invention seeks to eliminate numerous deficiencies in the prior art by embodying greater convenience, less chromatographic ambiguity, greater sensitivity and safer handling than existing procedures. In particular, this invention involves the determination of DNA sequences using a combination of chain termination DNA sequencing techniques and mass spectroscopy. Thus, in a typical chain terminating DNA sequencing determination such as that taught by Sanger, et al., Proc. Natl. Acad. Sci. USA, 74, 5463, (1977), a DNA primer, deoxynucleotide triphosphates and dideoxynucleotide triphosphates, in the presence of a DNA polymerase such as Klenow fragment, are used to determine the DNA sequence. However, in embodiments of the present invention the DNA primer, the deoxynucleotides or the dideoxynucleotides are labeled with isotopes detectable by mass spectrometry to determine the DNA sequence. For example, if the dideoxynucleotide (A, G, C, T) triphosphates, abbreviated as ddATP, ddGTP, ddCTP and ddTTP respectively, are each labeled with an isotope of a different mass, and the corresponding chain terminated fragments are separated and analyzed by mass spectrometry, a direct determination of the DNA sequence is obtained. The isotopic component of each dideoxynucleotide of the chain terminated DNA sequence is converted to a more convenient species for mass spectrometric determination, i.e., sulfur isotopes are oxidized to sulfur dioxide. If the DNA primer or deoxynucleotides are labeled, reactions between specifically labeled deoxynucleotides must first be carried out in the presence of a specific dideoxynucleotide. This is necessary so that a specific label is associated with a specific chain terminated DNA sequence. Once the individual reactions are conducted, the chain terminated DNA sequences can be mixed, separated, and analyzed by mass spectrometry because there will then be a specific relationship between a specific isotope and the terminal dideoxynucleotide. This invention is much more sensitive than existing systems and therefore is especially useful in determining the sequence of small quantities of DNA which are contaminants in products resulting from fermentation and other biotechnology related processes, i.e., for "screening" applications. The invention also includes reagents and analytic instruments for carrying out the above methods as well as intermediate mixtures of chain terminated DNA sequences produced while carrying out the methods of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

This invention relates to an improved method for sequencing polynucleotides using mass spectrometry to determine which of the four bases (adenine, guanine, cytosine or thymine) is a component of the terminal nucleotide. In particular, the present invention relates to identifying the individual nucleotides in a DNA sequence by the mass of stable nuclide markers contained within either the dideoxynucleotides, the DNA primer, or the deoxynucleotides added to the primer. The invention also includes reagents and analytical instruments for carrying out the above methods as well as mixtures of chain terminated DNA sequences.

Information regarding the identity of the terminal base in a particular fragment may be signified by using a unique isotopic label for each of the four bases. The determination of which isotope marker is present, and thus which terminal base a fragment contains, can then be readily accomplished by mass spectral methods. Detection of ions by mass spectrometry is perhaps the most sensitive physical method available to the analytical chemist, and represents orders of magnitude better sensitivity than optical detection of fluorescence.

If stable isotopes are chosen for labeling, then the isotope ratios are fixed by the mode of synthesis. The group of suitable atomic ("nuclide") markers includes those from carbon (¹²C/¹³C), chlorine (³⁵Cl/³⁷Cl), bromine (⁷⁹Br/⁸¹Br) and sulfur (³²S/³³S/³⁴S/³⁶S). Since sulfur, chlorine and bromine are not normal constituents of DNA, i.e., they are "foreign", analysis for those foreign isotopes does not require consideration of their natural abundance ratio. It is noted that sulfur is unique among this group in that it alone contains four stable isotopes, each of which can be used to represent one of the four nucleotide bases.

Further, if the fragments are subjected to combustion, then a light volatile derivative of the marker atom can be detected. Combustion converts DNA to the oxides of carbon, hydrogen, nitrogen and phosphorus. The inclusion of a combustion step enormously simplifies the detection of trace atoms because it eliminates the problem of producing and analyzing high mass molecular ions.

With sulfur, combustion of the polynucleotide fragments in a hydrogen-oxygen flame or pyrolysis tube will yield sulfur dioxide (SO₂). Thus, the terminal base of the fragment may be identified by determining the mass of the SO₂ ion as 64, 65, 66, or 68. This is a simple distinction for existing mass spectral devices using either quadrupole or permanent magnet analyzers. For a permanent magnet device, a set of four permanently fixed ion detectors can be mounted to continuously monitor the individual ion currents. A quadrupole analyzer with a single ion-multiplier detector is presently preferred.
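The arithmetic behind the four SO₂ masses can be sketched as follows. The sketch uses nominal (integer) masses and assumes both oxygen atoms are ¹⁶O. The assignment of isotopes to bases follows the example given for FIG. 2B (A, C, G, T labeled with ³²S, ³³S, ³⁴S, ³⁶S), which is only one possible assignment, and the names used are hypothetical.

```python
# Nominal mass of 16-O; minor oxygen isotopes are ignored in this sketch.
OXYGEN_MASS = 16

# One possible isotope-to-base assignment (the one used in the FIG. 2B example).
SULFUR_LABELS = {"A": 32, "C": 33, "G": 34, "T": 36}


def so2_mass(sulfur_mass):
    """Nominal mass of the SO2 ion formed from a sulfur isotope."""
    return sulfur_mass + 2 * OXYGEN_MASS


# Lookup from the observed SO2 mass to the terminal base.
MASS_TO_BASE = {so2_mass(mass): base for base, mass in SULFUR_LABELS.items()}


def base_from_mass(observed_mass):
    """Return the terminal base for an SO2 mass, or None if it is not a label mass."""
    return MASS_TO_BASE.get(observed_mass)
```

Because there is no stable ³⁵S, mass 67 does not correspond to any label in this scheme, so `base_from_mass(67)` returns `None`.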

There are numerous ways in which a marker isotope could be incorporated into the complementary DNA fragment. These include substituting the marker isotope on the pyrimidine, purine, or ribose moieties, or the phosphate bridges between individual nucleotides. Further, the marker isotope may be contained in part of the cDNA primer, randomly incorporated along the chain in one or several of the deoxy-base units, or specifically in the terminal dideoxy residue. The only restriction is that the particular substitution be unique for that particular set of fragments.

The site for the stable sulfur label is most preferably the phosphate bridge, using labeled thiophosphate in place of ordinary phosphate. The technique for inserting a stable thiophosphate label in place of ordinary phosphate is similar to that employed in conventional ³⁵S radiolabeling experiments. The chemistry and enzymology of the polymerase reaction using deoxynucleotide α-thiotriphosphates have been investigated extensively. Any future developments in cloning vectors or polymerase enzymes should also be able to utilize the thiophosphate derivatives of the present invention.

If the isotope label is to be incorporated into the cDNA primer or randomly along the chain as a deoxy-base surrogate, then it is necessary to perform a separate polymerase experiment with each of the appropriate dideoxy-base residues prior to mixing and chromatography. The advantage of using primer or intra-chain labeling is that several atoms of the marker isotope may be incorporated per molecule of DNA fragment, thus enhancing detection sensitivity.

If, on the other hand, the isotope label is contained in the dideoxy base itself, then it is not necessary to perform individual polymerase experiments. Instead, a mixture of the four dideoxy bases, each with a unique isotope label, together with a mixture of four normal deoxy bases in stoichiometric ratios appropriate for the specific polymerase enzyme could be used to generate the complete set of labeled fragments in a single polymerase experiment. Each fragment, regardless of its size, will contain one atom of the marker isotope on its terminal (dideoxy) nucleotide, wherein the marker isotope would indicate the identity of the terminal nucleotide.
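The single-experiment case can be illustrated by a toy simulation. It is a sketch under stated assumptions, not a model of real polymerase kinetics: every position terminates with the same hypothetical probability `p_stop`, the isotope assignment is the one used in the FIG. 2B example, and all function and variable names are illustrative.

```python
import random

# Isotope carried by each labeled dideoxynucleotide (FIG. 2B example assignment).
DD_LABEL = {"A": 32, "C": 33, "G": 34, "T": 36}
LABEL_TO_BASE = {mass: base for base, mass in DD_LABEL.items()}


def extend_once(new_strand, p_stop, rng):
    """Grow one chain; return (length, isotope) if a dideoxy base terminates it."""
    for position, base in enumerate(new_strand, start=1):
        # Assumed: a fixed chance that the dideoxy analog wins the competition.
        if rng.random() < p_stop:
            return position, DD_LABEL[base]
    return None  # chain reached the end of the template without terminating


def single_tube_reaction(new_strand, n_chains=2000, p_stop=0.2, seed=0):
    """Pool the distinct terminated fragments produced in one reaction."""
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_chains):
        fragment = extend_once(new_strand, p_stop, rng)
        if fragment is not None:
            fragments.add(fragment)
    return sorted(fragments)  # sorting by length mimics the size separation


strand = "GATTCAGC"  # hypothetical complementary strand
fragments = single_tube_reaction(strand)
read_back = "".join(LABEL_TO_BASE[label] for _, label in fragments)
```

Each fragment carries exactly one label, so ordering the pooled fragments by length and reading the isotope of each one is enough to recover the strand without separate lanes.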

Schemes A and B below illustrate typical sulfur and halogen labeling respectively,

wherein an asterisk (*) is used herein to indicate the presence of an isotopic label in accordance with the present invention;

wherein by A, T, G, and C is meant the bases adenine, thymine, guanine, and cytosine respectively;

wherein by "Alk" is meant straight or branched chain lower alkyl of 1-6 carbon atoms;

wherein by "S*" is meant a sulfur isotope of the group consisting of ³²S, ³³S, ³⁴S and ³⁶S with the proviso that each isotope be uniquely associated with a member of the group consisting of A, T, G, and C respectively; and

wherein by "X*" is meant a "halogen" isotope of the group consisting of ³⁵Cl, ³⁷Cl, ⁷⁹Br and ⁸¹Br with the proviso that each isotope be uniquely associated with a member of the group consisting of A, T, G and C respectively.

##STR1##

Labeling Schemes C, D, and E below show three ways in which specific isotopes, designated as *1, *2, *3, and *4, can be uniquely associated with specific terminal nucleotides in a terminated complementary DNA (cDNA) sequence. For convenience in Schemes C-E, the "TP" designation for the deoxy and dideoxy triphosphates has been deleted. In Schemes C and D, the dideoxy chain terminating reaction is conducted separately and then the terminated chains are mixed prior to separation. In particular, Scheme D is a modification of the chemical cleavage procedure of Gish and Eckstein, Science, 240, 1520 (1988) whereby the DNA fragment undergoes selective alkaline cleavage adjacent to the phosphorothioate linkage, leaving a labeled deoxy compound as the terminal nucleotide in the fragment. In Scheme D, one creates a series of such fragments which differ from one another solely in size, via the presence of an additional terminal nucleotide. Identification of each terminal nucleotide (via each isotopic marker) in relation to size of the fragment provides the base sequence of the DNA or polynucleotide of interest.

In Scheme E, a mixture of the four individually labeled dideoxynucleotide triphosphates, together with a mixture of the four deoxynucleotide triphosphates, is reacted with the primer ("P") in a single reaction. Because only one reaction and one separation need be run in Scheme E, it can be readily seen that the labeled dideoxy scheme, Scheme E, is the preferred method of the present invention. Particularly preferred reactants in Scheme E are the labeled dd(A*, C*, G*, or T*) triphosphates where the labels ³²S, ³³S, ³⁴S and ³⁶S replace a phosphate oxygen as shown in Scheme A.

SCHEME C
Labeled Primers

1. P(A*1) + d(A, C, G, T) + dd(A)  =>  P(A*1) dN-dN . . . dd(A)
2. P(C*2) + d(A, C, G, T) + dd(C)  =>  P(C*2) dN-dN . . . dd(C)
3. P(G*3) + d(A, C, G, T) + dd(G)  =>  P(G*3) dN-dN . . . dd(G)
4. P(T*4) + d(A, C, G, T) + dd(T)  =>  P(T*4) dN-dN . . . dd(T)

SCHEME D
Labeled Deoxy

1. P + d(A, C, G, T) + d(A*1) + dd(A)  =>  P . . . dN-dA*1 dN . . . dd(A)
2. P + d(A, C, G, T) + d(C*2) + dd(C)  =>  P . . . dN-dC*2 dN . . . dd(C)
3. P + d(A, C, G, T) + d(G*3) + dd(G)  =>  P . . . dN-dG*3 dN . . . dd(G)
4. P + d(A, C, G, T) + d(T*4) + dd(T)  =>  P . . . dN-dT*4 dN . . . dd(T)

SCHEME E
Labeled Dideoxy

1. P + d(A, C, G, T) + dd(A*1, C*2, G*3, T*4)  =>  P . . . dN-dN . . . dd(A*1)
                                                  P . . . dN-dN . . . dd(C*2)
                                                  P . . . dN-dN . . . dd(G*3)
                                                  P . . . dN-dN . . . dd(T*4)

Accordingly, the invention not only includes reagents but also a mixture of unique isotopically labeled dideoxynucleotide triphosphates (ddC*TP, ddG*TP, ddT*TP and ddA*TP) where each dideoxynucleotide triphosphate is labeled with a different sulfur or halogen isotope. In particular, sulfur labeling, consisting of the isotopes ³²S, ³³S, ³⁴S and ³⁶S, is preferred.

The invention also includes the intermediate mixture of dideoxy chain-terminated DNA sequences in which each chain-terminated DNA sequence contains an isotope detectable by mass spectrometry and in which each isotope relates to a specific chain-terminating dideoxynucleotide. The stable nuclide marker may be incorporated into the DNA primer, the DNA chain extended from the primer, or the dideoxynucleotide which terminates the DNA chain extended from the primer. The invention also includes the mixture of isotope-labeled, chain-terminated DNA sequences separated by size.

Although this invention has been discussed in terms of sequencing DNA, the method and the reagents of the present invention could also be used to sequence RNA by providing reverse transcriptase as the polymerase.

FIG. 1 illustrates a complementary DNA sequence attached to a primer DNA sequence and a typical series of chain terminated polynucleotide fragments prepared according to Scheme E from mixtures of deoxynucleotide triphosphates and labeled dideoxynucleotide triphosphates. These labeled fragments illustrate labeled chain terminated complementary DNA sequences 1 prepared by the method of the present invention, wherein the size of each complementary DNA fragment corresponds to the relative position of that fragment's terminal nucleotide in the overall complementary DNA sequence. These labeled fragment sequences are separated by size by an electrophoresis column 2. The fragments from the electrophoresis column 2 are sequentially eluted to a detector 3. FIG. 2A shows in more detail the apparatus for determining a DNA sequence. DNA sequences are prepared in reaction chamber 4. The mixture of labeled terminated DNA fragments is separated according to size by electrophoresis on a polyacrylamide gel column 5 wherein migration occurs from the cathode (V⁻) to the anode (V⁺). The fractions are taken off the polyacrylamide gel column 5 sequentially by size at transfer point 6, where a means 7 is provided for transferring the terminated DNA fragments to an oxidizer or combustion chamber 8. In the oxidizer or combustion chamber 8, the sulfur label is oxidized to SO₂ and the labeled SO₂ is detected in a mass spectrometer 9. FIG. 2B shows typical superimposed "ion current v. time" plots for m/e 64, 65, 66, and 68, corresponding to ions produced by the four stable isotopes of sulfur, i.e., ³²SO₂, ³³SO₂, ³⁴SO₂ and ³⁶SO₂, respectively. When the stable isotopes of sulfur associated with the bases A, C, G and T are ³²S, ³³S, ³⁴S and ³⁶S respectively, a plot corresponding to the DNA sequence illustrated at the top of FIG. 2B is obtained. In this manner, the DNA sequence of any genetic material can be determined automatically and on femto- or nanomolar quantities of material.
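Reading a sequence from the four superimposed traces of FIG. 2B can be sketched as a simple peak-picking routine. This is only an illustration: the traces below are synthetic, the threshold and the local-maximum rule are assumptions rather than anything specified for the apparatus, and the function name `call_bases` is hypothetical.

```python
# m/e channel to base, using the FIG. 2B assignment (A, C, G, T -> 64, 65, 66, 68).
CHANNEL_TO_BASE = {64: "A", 65: "C", 66: "G", 68: "T"}


def call_bases(traces, threshold=0.5):
    """Call one base per peak from four ion-current traces sampled at the same times.

    traces maps each m/e channel to a list of ion currents. At every time
    point the strongest channel is taken; a base is called where that
    strongest signal is a local maximum above the threshold.
    """
    n_points = len(next(iter(traces.values())))
    dominant = []
    for t in range(n_points):
        channel = max(traces, key=lambda mz: traces[mz][t])
        dominant.append((traces[channel][t], channel))

    calls = []
    for t, (value, channel) in enumerate(dominant):
        before = dominant[t - 1][0] if t > 0 else 0.0
        after = dominant[t + 1][0] if t + 1 < n_points else 0.0
        if value > threshold and value > before and value >= after:
            calls.append(CHANNEL_TO_BASE[channel])
    return "".join(calls)


# Synthetic, idealized traces: one clean peak per fragment, shortest fragment first.
traces = {
    64: [0, 1, 0, 0, 0, 0, 0, 0, 0],
    65: [0, 0, 0, 1, 0, 0, 0, 0, 0],
    66: [0, 0, 0, 0, 0, 0, 0, 1, 0],
    68: [0, 0, 0, 0, 0, 1, 0, 0, 0],
}
sequence = call_bases(traces)
```

Because the four channels are discrete masses rather than overlapping spectral bands, the call at each peak reduces to asking which channel carries the signal.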

There are several variations in the design of an automated DNA sequencer of the present invention. The major components of the device are the reaction chamber for conducting polymerase reactions, the chromatographic device consisting of some form of electrophoresis, the effluent transport, the combustion system, and the mass spectral analyzer. Because this instrument is designed to operate on femto- and nanomolar quantities of DNA, it is important that the geometry of all component systems be kept to a minimum size.

The chromatographic system may be of a laned plate or tubular configuration. In the plate designs, the supporting medium for the chromatographic separation will be most preferably a polyacrylamide gel, where the ratio of acrylamide to bis-acrylamide is more preferably between 10:1 and 100:1. Although persulfate is the typical polymerization catalyst used by most workers to prepare polyacrylamide gel plates, the background of sulfate ions may be unacceptably high without extensive washing. Ultraviolet irradiation can be used successfully to initiate cross-linking and produce high quality gels, which can be used immediately without washing.

For tubular designs, the chromatographic separation may be conducted in a gel-filled capillary or in an open tubular configuration. The preferred dimensions of the capillary depend on whether an open or gel-filled medium is selected. For gel-filled devices, preferred diameters are 50 to 300 microns. In open tubular configurations, however, the preferred diameters are 1 to 50 microns. The preferred length of the capillary depends on the diameter and the amount of DNA sample which will be applied, as well as the field strength of the applied electrophoretic voltage. The preferred length is optimally between 0.25 and 5 meters.

For open tubular configurations, the capillary will preferably be fabricated from fused silica. Under typical operating conditions, where the pH of the buffer is usually maintained in a range of 5.0 to 11.0, the surface of the silica will have a net negative charge. This surface charge establishes conditions in which there is a bulk electroosmotic flow of buffer toward the negative electrode. The DNA fragments also possess a negative charge and are attracted to the positive electrode. Hence, they will move more slowly than the bulk electroosmotic flow. In gel-filled devices, the supporting medium minimizes electroosmosis. Since the gel has no charge, the negatively charged DNA fragments migrate toward the positive electrode.

There are a variety of techniques to modify the surface charge on the wall of an open capillary. One particularly useful method is to covalently modify the wall with a monomer such as methacryloxypropyltrimethoxysilane. This monomer can then be crosslinked with the acrylamide to give a thin bonded monolayer which is similar in characteristics to the polyacrylamide gel-filled capillaries. The distinct advantage of the coated wall method is that the capillary can be recycled after each analysis run simply by flushing with fresh buffer. (See S. Hjertén, J. Chromatography, 347, 191 (1985) for details.)

The transfer system is selected to match the particular chromatographic design. The chromatographic system may be of laned plate or tubular configuration. It is desirable to have the chromatography effluent in a closed environment. The tubular configurations may be more amenable to sample transfer designs which pump, spray or aspirate the column effluent into the combustion chamber, and thus minimize degradation in resolution because of post-chromatographic remixing of fractions.

For the open plate type devices, a moving belt or wire system can be used satisfactorily. In this system, a thin coating of the column effluent is spread on the ribbon to transport the eluted fractions through a predrying oven and then into the combustion furnace. The transport ribbon may be fabricated from platinum or other noble metal, and may be continuously looped because the ribbon can be effectively cleaned upon passage through the combustion furnace. Less preferably, the ribbon may be fabricated from a glass or ceramic fiber or carbon steel. In this design, the ribbon would be taken up on a drum for disposal.

For tubular configurations, the most preferred embodiment of the transfer system is to use an electrospray nebulizer method to create a fine aerosol of the column effluent. In this technique, a small charge of optimally less than 3000 volts is applied to the emerging droplet. The charge of a larger droplet tends to disperse it into a very fine mist of singly charged droplets. These fine charged droplets can be focused and directed by electric or magnetic fields, much as in an ink-jet printer head. It is important to control the temperature of the flowing gas stream into which the aerosol is introduced. If the temperature is too great, then the droplets will tend to evaporate on the capillary injector. If the temperature is too low to overcome the surface tension of the effluent, the individual droplets will not be adequately dispersed. The composition and ionic strength of the supporting electrolyte are important. The preferred buffers are phosphate or tris-acetate at less than 0.1 M concentration. Because the overall method is based on detection of trace atoms, it is critically important that the buffers be free of contaminant ions, such as sulfate.

A second satisfactory method to create an aerosol of the column effluent is an ultrasonic device which produces sufficient material shear and local heating to disperse the droplet. A similar type of shear may also be generated off the tip of a capillary injector into a venturi type aspirator, where the flow of supporting gas is developed by the pressure differential into the mass spectrometer. In these designs, it is often desirable to add small amounts of additional aqueous or organic solvents at the tip in order to aid in the flash evaporation of the effluent.

The combustion section is designed to completely burn the vaporized or surface evaporated chromatography effluent of DNA fragments together with the supporting electrolytes and optional solvent modifier to the oxides of carbon, hydrogen, nitrogen, phosphorus, and most importantly, those of the marker isotope. The sensitivity of detection of the marker isotope is greatly affected by the efficiency of combustion, since low molecular weight fragment ions which may result from incomplete burning can mask the presence of the primary detection species. The sequencing preparation must be free of other ions in the mass/charge range of 64 to 68, since such ions would interfere with detection of the isotopic sulfur dioxide.

The combustion may be accomplished at moderate to approximately atmospheric pressure prior to injection into the mass spectrometer. For sulfur-containing streams, essentially complete combustion to sulfur dioxide will be achieved when the sample is heated to temperatures in excess of approximately 900° C. in an oxygen environment.

The most rugged design is to simply aspirate the column effluent into a hydrogen-oxygen flame, similar in design to standard gas chromatography flame ionization detectors. The important characteristics of the flame, the temperature and sample residence time, will be determined by the ratio of hydrogen to oxygen, the aggregate flow rate of gases and the local pressure. The characteristics of sulfur-containing flames at 100-150 torr have been described by Zachariah and Smith, Combustion and Flame, 69, 125 (1987). A limitation to the sensitivity of this design is the volume of gas (water vapor) resulting from hydrogen-oxygen combustion which effectively dilutes the sulfur dioxide. Although the standard mass spectrometry techniques such as gas separators or semipermeable membranes may be used to remove water vapor, there is a trade-off between sample dilution and ultimate detectability which must be considered for each design.

A preferred method to very efficiently burn the nebulized column effluent is to inject it into a short heated tube in an oxygen environment. The tube may be constructed of noble metals such as platinum, ceramic or quartz, depending on the method of heating. The external heating action may be provided by a cartridge electrical resistance heater or an external flame. The tube may be packed with a heat exchanger medium such as glass wool. Optionally, a catalytic surface may also be provided by such materials as supported platinum or copper oxide to enhance combustion efficiency.

A particularly effective method to burn the sample is in an inductively coupled oxygen plasma, where the tube forms the resonant cavity of a microwave generator. The inductively coupled plasma techniques have been reviewed by G. Meyer, Anal. Chem., 59, 1345A (1987).

Alternatively, the combustion may be effected within the low pressure environment of the mass spectrometer. A standard ionization technique in mass spectrometry is fast atom bombardment. An energetic beam of atoms, usually xenon, is produced in a plasma torch and directed toward the sample. Ionization occurs by collision induced dissociation of this beam with the sample. If instead of pure xenon, oxygen is introduced into the fast atom beam, then both oxidation and ionization of the sample can occur. The limitation of this method, however, is the difficulty of achieving quantitative oxidation, and thus minimizing the background signal from incompletely oxidized low mass fragments.

There are several methods to effect ionization of sulfur dioxide. The objective is to obtain as high an ion current as possible. In designs which operate at atmospheric pressure by flame, corona discharge needle or microwave induced plasma discharge, the ionization efficiency will be very high. In this type of design, a portion of the ionized gas is introduced into the low pressure region of the mass analyzer through a small sampling orifice or skimmer cone. The size of the orifice, and thus the percentage of total combustion sample which can be introduced, will depend on the pumping speed of the vacuum system. Generally, less than five percent of the sample will be transferred to the analyzer region. Designs of this type have been described by T. Covey, Anal. Chem., 58, 1451A (1986) and by G. Hieftje, Anal. Chem., 59, 1644 (1987).

Alternatively, the sample may be ionized in the lower pressure region near the analyzer. This may be achieved by such methods as electron impact using the beam emanating from a hot filament, by fast atom bombardment with inert gases such as xenon, or by chemical ionization with a variety of light gases. In these types of design, although the ionization efficiency is low relative to atmospheric methods, a greater percentage of these ions actually get to the analyzer section. The electron impact techniques have been described by A. Bandy, Anal. Chem., 59, 1196 (1987).

An RF-only quadrupole mass filter may be used to help separate low molecular weight combustion products (H₂O, N₂ and CO₂),

The analyzer and ion detector sections can be selected from several commercially available designs. The analyzer may be a quadrupole device where mass selection depends on the trajectory in a hyperbolic field, a field swept electromagnetic device with a single ion detector, or a permanent magnet device with an array of four ion detectors tuned to the isotopes of interest. The detector may be of single stage or ion multiplier design, although the latter type is preferred for highest sensitivity.

When the mass spectrometer is being used to detect isotopes of sulfur dioxide, as would be the case when dideoxy terminated thiophosphates are being utilized, very high sensitivity is achieved when the polarity of the spectrometer is set to determine the positive ion spectrum. However, in glow discharge ionization, highest sensitivity is achieved when the spectrometer is operated in the negative ion mode.

When the mass spectrometer is being used to detect isotopes of chlorine or bromine, then maximum sensitivity will be achieved when the spectrometer polarity is reversed, and the negative ion spectrum is detected.

The labeled compounds of the present invention are prepared by conventional reactions employing commercially available isotopes. For example, the sulfur isotopes ³²S, ³³S, ³⁴S and ³⁶S are commercially available as CS₂ or H₂S from the Department of Energy (Oak Ridge, Tenn. or Miamisburg, Ohio) at "isotopic enrichments" of 99.8%, 90.8%, 94.3% and 82.2%, respectively. Although the method of the present invention will provide satisfactory results with the "isotopically enriched" commercial products, it is preferred that the sulfur isotopes be at 99.5% enrichment to facilitate interpretation of the ion current v. time plots.

Enrichment techniques for sulfur are well known in the art. In particular, CS₂ can be further enriched by fractional distillation which takes advantage of the different boiling points of CS₂ conferred by the various sulfur isotopes. Alternatively, gaseous diffusion of SF₆ also can provide further enrichment of the sulfur isotope of interest. Thereafter, the isotopically enriched CS₂*, H₂S* or SF₆* are converted into the reagents described herein by techniques well known in the art.

The "halogen" isotopes ³⁵Cl, ³⁷Cl, ⁷⁹Br and ⁸¹Br are also commercially available from the Department of Energy in either elemental form or as the corresponding halide salt at enrichments of 99%, 95%, 90% and 90%, respectively. These isotopes are used herein in their commercially available form.

As shown in Scheme F, labeled 2',3'-dideoxynucleotide-5'-(O-1-thiophosphates) II are prepared by initially reacting a 2',3'-dideoxynucleoside I with isotopically enriched thiophosphoryl chloride (PS*Cl₃) in triethyl phosphate, wherein by "S*" is meant a sulfur isotope that is a member of the group consisting of ³²S, ³³S, ³⁴S and ³⁶S in isotopically enriched form. From the product of the above reaction, the corresponding 2',3'-dideoxynucleotide-5'-(O-1-thiotriphosphate) III is prepared from II by dissolving the bis-triethylamine (TEA) salt of II in dioxane and reacting it with diphenyl phosphochloridate to form the diphenyl phosphate ester of II. This phosphate ester is further reacted with the tetrasodium salt of pyrophosphate in pyridine to form III. Purification of III is accomplished by chromatography on diethylaminoethyl (DEAE) cellulose.

Scheme G shows the general method for preparing 3'-halo-2',3'-dideoxynucleosides V wherein the halogen ("X*") is a member of the group consisting of ³⁵Cl, ³⁷Cl, ⁷⁹Br, and ⁸¹Br in isotopically enriched form. In particular, a solution of 1-(5-O-triphenylmethyl-2-deoxy-β-D-threopentofuranosyl)nucleoside, wherein by nucleoside is meant A, T, G, or C, in a basic solvent, such as pyridine, is reacted with methanesulfonyl chloride. The resulting mesylate is treated with an isotopically enriched salt, such as LiX*, in the presence of heat and then acidified to produce a 3'-halo-2',3'-dideoxynucleoside V. Compounds of Formula V can be converted to the corresponding labeled nucleotide monophosphate VI by reaction with cyanoethyl phosphate and dicyclohexylcarbodiimide (DCC) followed by LiOH deblocking.

The corresponding triphosphate of V is prepared from the monophosphate as described for the conversion of II to III above.

Scheme H presents a method for halogenating a purine or pyrimidine base of a 2',3'-dideoxynucleoside VII using isotopically enriched elemental bromine (⁷⁹Br or ⁸¹Br) or chlorine (³⁵Cl or ³⁷Cl), i.e. X₂*. The 2',3'-dideoxynucleoside VII is dissolved in a polar solvent, such as dry DMF, in the presence of a base, such as pyridine. To the reaction mixture is added a molar equivalent of the elemental halogen (X₂*) and the reaction mixture is allowed to stir for 12 hours. Evaporation of the solvent produces the labeled 2',3'-dideoxynucleoside VIII wherein the isotopic halogen label is on the purine or pyrimidine base of the dideoxynucleoside. Purification is accomplished by conventional chromatographic techniques.

The labeled 2',3'-dideoxynucleoside VIII is converted to the corresponding monophosphate and triphosphate as discussed above for VI and III, respectively.

The examples described herein are intended to illustrate the present invention and not limit it in spirit or scope. ##STR2##

EXAMPLE 1 Preparation of [³²S]2',3'-Dideoxyadenosine-5'-Phosphorothioate

2',3'-Dideoxyadenosine (47 mg, 0.2 mmol) was suspended in triethyl phosphate (0.5 ml) and heated briefly to 100° C. The solution was cooled to 4° C. and treated with [³²S] thiophosphoryl chloride (37 mg, 0.22 mmol). The mixture was agitated for 12 hr. at 4° C., and then treated with 2 ml 10% barium acetate and agitated at 20° C. for 1 hour. The suspension was treated with 0.5 ml triethylamine and then with 5 ml 95% ethanol. The suspension was agitated for 30 min. and then filtered. The precipitate was washed with 50% aqueous ethanol and then water. The filtrate was evaporated to dryness, and the solid taken up in water and chromatographed on a column of diethylaminoethyl (DEAE) cellulose which had been equilibrated with NH₄HCO₃. The column was eluted with 0.1 M NH₄HCO₃ and the fractions absorbing at 260 nm were pooled and evaporated. The solid was evaporated twice with 80% ethanol, twice with 80% ethanol containing 2% triethylamine (TEA), and finally with anhydrous ethanol. There was obtained 44 mg of the bis-triethylamine salt of the title product. A solution of the triethylamine salt in 1 ml methanol was treated with 1 ml of a solution of 6 M NaI in acetone. The precipitate was washed with acetone and dried to give 32 mg of the disodium salt of the title product as a white solid.
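The reagent quantities quoted in Example 1 can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes the molecular formula C₁₀H₁₃N₅O₂ for 2',3'-dideoxyadenosine and PSCl₃ for thiophosphoryl chloride, uses rounded standard atomic masses together with the ³²S isotope mass for the labeled reagent, and the helper names are hypothetical.

```python
# Illustrative stoichiometry check; helper names and rounded masses are assumptions.

# Rounded atomic masses in g/mol. "S32" is the mass of the 32S isotope,
# used because the thiophosphoryl chloride in Example 1 is 32S-labeled.
ATOMIC_MASS = {
    "C": 12.011,
    "H": 1.008,
    "N": 14.007,
    "O": 15.999,
    "P": 30.974,
    "Cl": 35.45,
    "S32": 31.972,
}

# Assumed compositions (element -> atom count).
DIDEOXYADENOSINE = {"C": 10, "H": 13, "N": 5, "O": 2}
THIOPHOSPHORYL_CHLORIDE_32S = {"P": 1, "S32": 1, "Cl": 3}


def molar_mass(composition):
    """Molar mass in g/mol, which is numerically equal to mg/mmol."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def millimoles(mass_mg, composition):
    """Convert a mass in milligrams to millimoles."""
    return mass_mg / molar_mass(composition)


nucleoside_mmol = millimoles(47, DIDEOXYADENOSINE)
reagent_mmol = millimoles(37, THIOPHOSPHORYL_CHLORIDE_32S)
equivalents = reagent_mmol / nucleoside_mmol

print(f"nucleoside: {nucleoside_mmol:.3f} mmol")
print(f"thiophosphoryl chloride: {reagent_mmol:.3f} mmol")
print(f"equivalents of reagent: {equivalents:.2f}")
```

Under these assumptions the computed amounts round to the 0.2 mmol and 0.22 mmol stated in the example, which corresponds to roughly a ten percent molar excess of the thiophosphorylating reagent.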

EXAMPLE 2 Preparation of [³²S]2',3'-Dideoxyadenosine-5'-(O-1-Thiotriphosphate)

The bis-triethylamine (TEA) salt of the title product of Example 1 (26 mg, 0.05 mmol) was dissolved in 1 ml dry dioxane and treated with diphenyl phosphochloridate (0.015 ml, 0.075 mmol). The mixture was agitated for 3 hr. at 25° C. A solution of dry pyrophosphate in pyridine was prepared by dissolving the tetrasodium salt (220 mg, 0.5 mmol) in 3 ml pyridine, evaporating twice, and then taking up the residue in 0.5 ml pyridine. This solution was added to the above solution of the crude active ester, and stirred for 2 hr. The crude product was precipitated by addition of ether (10 ml). The precipitate was dissolved in water and chromatographed on DEAE-cellulose eluted with 0.1 M triethylammonium bicarbonate. The pooled fractions contained 150 A₂₆₀ units (20%) of the title product. The solution was lyophilized and the residue stored at -70° C.

EXAMPLE 3 Preparation of [⁷⁹Br]3'-Bromo-2',3'-Dideoxythymidine

A solution of 1-(5-O-triphenylmethyl-2'-deoxy-β-D-threopentofuranosyl)thymine (50 mg, 0.1 mmol) in pyridine (1 ml) was treated with methanesulfonyl chloride (0.014 ml, 0.12 mmol) and the reaction agitated at 10° C. for 6 hr. The mixture was then evaporated, diluted with CHCl₃ (5 ml) and washed 2× with water. The organic layer was evaporated, and the crude mesylate dissolved in dry diglyme (1 ml) and treated with [Li⁷⁹Br] (17 mg, 0.2 mmol). The solution was heated for 4 hr. at 100° C., and then diluted with 80% acetic acid (1 ml) and heated 15 min. longer. The reaction was cooled and diluted with water (2 ml), and extracted 3× with chloroform (2 ml). The organic extracts were evaporated and the residue chromatographed on silica gel eluted with 95:5 chloroform:methanol to give 18 mg of the title product as an off-white solid.

This material could be converted to the triphosphate via the monophosphate as in Example 2. The monophosphate was prepared by reaction with cyanoethyl phosphate and dicyclohexylcarbodiimide (DCC), followed by LiOH deblocking.

EXAMPLE 4 Preparation of [⁷⁹Br]2',3'-Dideoxy-5-Bromocytidine

To a solution of 2',3'-dideoxycytidine (60 mg, 0.3 mmol) in dry DMF (1 ml) was added 0.1 ml pyridine and then [⁷⁹Br] bromine (42 mg, 0.3 mmol), and the mixture agitated for 12 hr. The solvent was evaporated, and the residue chromatographed on silica gel (ethyl acetate:methanol:triethylamine 90:10:1) to give 46 mg of the title product as a white solid.

This material could be converted to the triphosphate via the monophosphate as in Example 2. The monophosphate was prepared by reaction with cyanoethyl phosphate and dicyclohexylcarbodiimide followed by LiOH deblocking.

What is claimed is:
 1. A DNA sequence analyzer comprising: (a) a means for separating chain terminated DNA sequences labeled with an isotope according to size; (b) a combustion chamber which converts the elements of the chain terminated DNA sequence including the isotope into oxides; (c) a means for transporting the chain terminated DNA sequences from the means for separating the chain terminated DNA sequences to the combustion chamber; (d) a mass spectrometer operatively associated with the combustion chamber for analyzing oxides of isotopes of different mass bound to the chain terminated DNA sequences in a relationship that associates the mass of an isotope with a terminal nucleotide of the DNA sequence.
 2. A DNA analyzer according to claim 1 wherein the analyzer is designed and arranged to distinguish between the isotopes ³²S, ³³S, ³⁴S, and ³⁶S.